244 research outputs found

CAMUR: Knowledge extraction from RNA-seq cancer data through equivalent classification rules

Author: Bertolazzi Paola
Cestarelli Valerio
Felici Giovanni
Fiscon Giulia
Weitschek Emanuel
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2015

Knowledge extraction methods for Next Generation Sequencing data are in high demand. In this work, we focus on RNA-seq gene expression analysis and specifically on case-control studies with rule-based supervised classification algorithms that build a model able to discriminate cases from controls. State-of-the-art algorithms compute a single classification model that contains few features (genes). In contrast, our goal is to elicit more knowledge by computing many classification models, and thereby to identify most of the genes related to the predicted class.

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

Predicting long-term publication impact through a combination of early citations and journal impact factor

Author: Abramo Giovanni
D'Angelo Ciriaco Andrea
Felici Giovanni
Publication venue: Elsevier BV
Publication date: 01/01/2019

The ability to predict the long-term impact of a scientific article soon after its publication is of great value for the accurate assessment of research performance. In this work we test the hypothesis that good predictions of long-term citation counts can be obtained through a combination of a publication's early citations and the impact factor of the hosting journal. The test is performed on a corpus of 123,128 WoS publications authored by Italian scientists, using linear regression models. The average accuracy of the prediction is good for citation time windows above two years, decreases for lowly-cited publications, and varies across disciplines. As expected, the role of the impact factor in the combination becomes negligible only two years after publication.
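As a rough illustration of the kind of model the abstract above describes, the sketch below fits a linear regression with two predictors (early citations and journal impact factor) by ordinary least squares. It is not the authors' code: the function names `fit_ols` and `predict` are hypothetical, and the numbers are made-up toy values, not data from the study.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows    -- list of predictor tuples, e.g. (early_citations, impact_factor)
    targets -- list of observed long-term citation counts
    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0, *r] for r in rows]  # prepend a constant column for the intercept
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(p)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coeffs, row):
    """Predicted long-term citations for one (early_citations, impact_factor) pair."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


# Toy data (invented for illustration): (early citations, impact factor)
toy_rows = [(0, 1.0), (1, 2.0), (2, 1.0), (3, 4.0), (5, 3.0)]
toy_targets = [1.5, 4.0, 5.5, 9.0, 12.5]
coeffs = fit_ols(toy_rows, toy_targets)
```

Fitting one such model per citation time window would allow the coefficient on the impact factor to be compared across windows, which is the kind of comparison the abstract reports.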

arXiv.org e-Print Archive

Hacking Smart Machines with Smarter Ones: How to Extract Meaningful Data from Machine Learning Classifiers

Author: Ateniese Giuseppe
Felici Giovanni
Mancini Luigi V.
Spognardi Angelo
Villani Antonio
Vitali Domenico
Publication date: 19/06/2013

Machine Learning (ML) algorithms are used to train computers to perform a variety of complex tasks and improve with experience. Computers learn how to recognize patterns, make unintended decisions, or react to a dynamic environment. Certain trained machines may be more effective than others because they are based on more suitable ML algorithms or because they were trained through superior training sets. Although ML algorithms are known and publicly released, training sets may not be reasonably ascertainable and, indeed, may be guarded as trade secrets. While much research has been performed on the privacy of the elements of training sets, in this paper we focus our attention on ML classifiers and on the statistical information that can be unconsciously or maliciously revealed from them. We show that it is possible to infer unexpected but useful information from ML classifiers. In particular, we build a novel meta-classifier and train it to hack other classifiers, obtaining meaningful information about their training sets. This kind of information leakage can be exploited, for example, by a vendor to build more effective classifiers or to simply acquire trade secrets from a competitor's apparatus, potentially violating its intellectual property rights.

arXiv.org e-Print Archive

CiteSeerX

Common operation scheduling with general processing times: A branch-and-cut algorithm to minimize the weighted number of tardy jobs

Author: Claudio Arbib
Giovanni Felici
Mara Servilio
Publication date: 01/04/2019

Common operation scheduling (COS) problems arise in real-world applications, such as industrial processes of material cutting or component dismantling. In COS, distinct jobs may share operations, and when an operation is done, it is done for all the jobs that share it. Here we propose a 0-1 LP formulation with exponentially many inequalities to minimize the weighted number of tardy jobs. Separation of inequalities is in NP, provided that an ordinary min Lmax scheduling problem is in P. We develop a branch-and-cut algorithm for two cases: one machine with precedence relations, and identical parallel machines with unit operation times. In these cases separation is the constrained maximization of a submodular set function. A previous method is modified to tackle the two cases, and compared to our algorithm. We report on tests conducted on both industrial and artificial instances. For a single machine and general processing times the new method clearly outperforms the other, thereby extending the range of COS applications.
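To make the objective in the abstract above concrete, here is a minimal sketch that evaluates the weighted number of tardy jobs for a given operation sequence on a single machine. It only scores a fixed sequence and is not the branch-and-cut algorithm of the paper. The function name `weighted_tardy`, the data layout, and the instance values are assumptions made for illustration; the sketch also assumes a job is complete once all of its operations are done.

```python
def weighted_tardy(sequence, proc_time, jobs):
    """Weighted number of tardy jobs for one operation sequence on one machine.

    sequence  -- operations in the order they are processed
    proc_time -- dict mapping operation -> processing time
    jobs      -- list of (set_of_operations, due_date, weight)
    """
    finish = {}
    clock = 0
    for op in sequence:
        clock += proc_time[op]
        finish[op] = clock  # a shared operation is done once, for every job using it

    total = 0
    for ops, due, weight in jobs:
        completion = max(finish[op] for op in ops)  # job ends with its last operation
        if completion > due:
            total += weight
    return total


# Hypothetical instance: operation "b" is shared by both jobs.
proc_time = {"a": 2, "b": 3, "c": 1}
jobs = [({"a", "b"}, 5, 4), ({"b", "c"}, 4, 3)]

print(weighted_tardy(["a", "b", "c"], proc_time, jobs))  # prints 3 (second job is tardy)
print(weighted_tardy(["c", "b", "a"], proc_time, jobs))  # prints 4 (first job is tardy)
```

In this toy instance neither of the two orders shown finishes both jobs on time, so the choice of sequence determines which weight is paid.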

Crossref

Open Access Repository

Learning to classify species with barcodes

Author: Bertolazzi Paola
Felici Giovanni
Weitschek Emanuel
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

A stochastic estimated version of the Italian dynamic General Equilibrium Model (IGEM)

Author: Acocella Nicola
Alleva Giorgio
Beqiraj Elton
Di Bartolomeo Giovanni
Di Dio Fabio
Di Pietro Marco
Felici Francesco
Liseo Brunero
Publication venue: Roma
Publication date: 01/01/2018

We estimate with Bayesian techniques the Italian dynamic General Equilibrium Model (IGEM), which has been developed at the Italian Treasury Department, Ministry of Economy and Finance, to assess the effects of alternative policy interventions. We analyze and discuss the estimated effects of various shocks on the Italian economy. Compared to the calibrated version used for policy analysis, we find lower wage rigidity and higher adjustment costs. The degree of price and wage indexation to past inflation is much smaller than the indexation level assumed in the calibrated model. No substantial difference is found in the estimated monetary parameters. Estimated fiscal multipliers are slightly smaller than those obtained from the calibrated version of the model.

Archivio della ricerca- Università di Roma La Sapienza

LAF : Logic Alignment Free and its application to bacterial genomes classification

Author: Cunial Fabio
Felici Giovanni
Weitschek Emanuel
Publication date: 01/01/2015

Alignment-free algorithms can be used to estimate the similarity of biological sequences and hence are often applied to the phylogenetic reconstruction of genomes. Most of these algorithms rely on comparing the frequency of all the distinct substrings of fixed length (k-mers) that occur in the analyzed sequences. In this paper, we present Logic Alignment Free (LAF), a method that combines alignment-free techniques and rule-based classification algorithms in order to assign biological samples to their taxa. This method searches for a minimal subset of k-mers whose relative frequencies are used to build classification models as disjunctive-normal-form logic formulas (if-then rules). We apply LAF successfully to the classification of bacterial genomes to their corresponding taxonomy. In particular, we succeed in obtaining reliable classification at different taxonomic levels by extracting a handful of rules, each one based on the frequency of just a few k-mers. State-of-the-art methods to adjust the frequency of k-mers to the character distribution of the underlying genomes have negligible impact on classification performance, suggesting that the signal of each class is strong and that LAF is effective in identifying it.
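The two ingredients named in the abstract above, relative k-mer frequencies and if-then rules in disjunctive normal form, can be sketched as follows. This is an illustrative toy, not the LAF implementation: the function names, the rule format, the taxon labels, and the thresholds are all hypothetical, and the rules are written by hand rather than learned from data.

```python
from collections import Counter


def kmer_frequencies(sequence, k):
    """Relative frequency of every k-mer (substring of length k) in `sequence`."""
    sequence = sequence.upper()
    total = len(sequence) - k + 1  # number of overlapping windows
    if total <= 0:
        return {}
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


def classify(freqs, rules, default="unknown"):
    """Apply if-then rules in order and return the label of the first one that fires.

    rules -- list of (label, conditions); conditions is a list of
             (kmer, min_frequency) pairs that must ALL hold (a conjunction).
             Several rules with the same label act as a disjunction,
             which gives a formula in disjunctive normal form per label.
    """
    for label, conditions in rules:
        if all(freqs.get(kmer, 0.0) >= threshold for kmer, threshold in conditions):
            return label
    return default


# Hypothetical hand-written rules with made-up thresholds.
rules = [
    ("taxon_A", [("AC", 0.3)]),
    ("taxon_B", [("GG", 0.2), ("TT", 0.1)]),
]
freqs = kmer_frequencies("ACGTAC", 2)  # "AC" occurs in 2 of the 5 windows
label = classify(freqs, rules)
```

Each rule here tests only one or two k-mers, which mirrors the abstract's remark that a handful of rules over a few k-mers can be enough.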

Springer - Publisher Connector

PubMed Central

Helsingin yliopiston digitaalinen arkisto